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Abstract 


We present an assimilation system for atmospheric carbon dioxide (CO2) using a Global 


Eulerian-Lagrangian Coupled Atmospheric model (GELCA), and demonstrate its capability to 


capture the observed atmospheric CO2 mixing ratios and to estimate CO2 fluxes. With the efficient 


data handling scheme in GELCA, our system assimilates non-smoothed CO data from 


observational data products such as the Observation Package (ObsPack) data products as 


constraints on surface fluxes. 


We conducted sensitivity tests to examine the impact of the site selections and the prior 


uncertainty settings of observation on the inversion results. For these sensitivity tests, we made 


five different site/data selections from the ObsPack product. In all cases, the time series of the 


global net CO2 flux to the atmosphere stayed close to values calculated from the growth rate of 


the observed global mean atmospheric CO2 mixing ratio. At regional scales, estimated seasonal 


CO fluxes were altered, depending on the CO 2 data selected for assimilation. Uncertainty 


reductions (URs) were determined at the regional scale and compared among cases. 


As measures of the model-data mismatch, we used the model-data bias, root-mean-square error, 


and the linear correlation. For most observation sites, the model-data mismatch was reasonably 


small. 


Regarding regional flux estimates, tropical Asia was one of the regions that showed a 
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significant impact from the observation network settings. We found that the surface fluxes in 


tropical Asia were the most sensitive to the use of aircraft measurements over the Pacific, and the 


seasonal cycle agreed better with the results of bottom-up studies when the aircraft measurements 


were assimilated. These results confirm the importance of these aircraft observations, especially 


for constraining surface fluxes in the tropics. 


Keywords: carbon cycle, top-down approach, flux estimation, data selection, carbon dioxide, 


inversion, coupled model, flux distribution, tropical Asia 


49 


51 


52: 


53 


54 


55 


56 


57 


58 


59 


60 


61 


62 


63 


64 


65 


66 


30 November 2016, p.4 / 54 


1. Introduction 


Carbon dioxide (CO2) is a major greenhouse gas and the most important contributor to 


anthropogenic climate change. Before the industrial revolution, the atmospheric CO2 exchange 


with natural carbon reservoirs (land and ocean) was largely in balance, in the absence of human 


influences. However, the combustion of fossil fuels (coal, natural gas, and oil), as well as certain 


industrial processes and land-use changes, has considerably increased since the pre-industrial era. 


The current level of CO2 in the atmosphere has increased by nearly 40% compared to the level in 


the pre-industrial era (Conway and Tans, 2014). Currently, about half of the extra CO2 that modern 


human activities have released into the atmosphere has been absorbed by the land biosphere and 


oceans (Ciais et al., 2010a). Although global land and ocean carbon sinks increase with rising 


atmospheric CO2, the Intergovernmental Panel on Climate Change Fifth Assessment Report stated 


with high confidence that global warming will reduce the sinks and partially counterbalance the 


equilibrium. It is thus urgent to understand the current status and trends of CO2 exchange between 


land, ocean, and atmosphere so that the potential impacts of ongoing global climate change on the 


carbon cycle can be assessed. 


Inverse modeling is one approach to quantifying the spatiotemporal distribution of sources 


and sinks at the Earth’s surface; this approach starts from a set of atmospheric mixing ratio 


observations by using an atmospheric transport model and sophisticated statistical inversion 
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schemes (Ciais et al., 2010b). Global Eulerian models have been used extensively for global CO2 


inversion (e.g., Gurney et al. (2004) and references therein). Initially, Eulerian models with low 


spatial resolution (starting from 10° x 10° in the 1980s) were able to reproduce the seasonal cycle 


of global atmospheric CO2 mixing ratios reasonably well. At that time, observational network was 


much less abundant and most observations were made at weekly to monthly intervals. In 1996, the 


GLOBALVIEW-CO>z data product was provided by the U.S. National Oceanic and Atmospheric 


Administration (NOAA) Earth System Research Laboratory (ESRL) 


(http://www.esrl.noaa.gov/gmd/ccgg/globalview/). This data product contains extended records of 


COz2 with a regular temporal distribution, derived from high-precision atmospheric measurements 


such as those from the World Data Centre for Greenhouse Gases 


(http://ds.data.jma.go.jp/gmd/wdcgg/introduction.html) of the World Meteorological Organization 


Global Atmospheric Watch program and the Carbon Dioxide Information and Analysis Center 


(CDIAC; http://cdiac.esd.ornl.gov). The observational records in GLOBALVIEW products are 


free of temporal gaps and have been extensively used by many carbon cycle models. Recently, 


spatial observational coverage has been expanding as more vertical profiles and better horizontal 


coverage become available from aircraft and satellite measurements, and measurement frequency 


has been getting higher as more continuous measurements are being made at surface stations, 


including tower sites (Bruhwiler et al., 2011; Saeki et al., 2013; Houweling et al., 2015). Models 
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have been developed that are able to handle the higher frequency but irregular datasets, and such 


models have started to use actual data for inversion (e.g. Rodenbeck et al., 2003; Chevallier et al., 


2010; Chevallier et al., 2011). NOAA ESRL released a new set of observation data products in 


2012 as a successor to GLOBALVIEW, called Observation Package (ObsPack) data products 


(Masarie et al., 2014). 


To derive regional surface flux information, high-frequency observations that represent 


hourly to synoptic variations are particularly useful. Nevertheless, simulating fine spatial and 


temporal CO2 variability in the vicinity of variable sources and sinks is quite challenging. Global 


Eulerian models with high spatial resolution have a high computational cost. One way of obtaining 


higher resolution flux estimates within a region of interest is to use a “zoomed” or “nested” 


atmospheric transport model (Peters et al., 2005; Peylin et al., 2005). The idea of coupling two 


different types of models for global and regional modeling for inversion was introduced by 


Rodenbeck et al. (2009), and Trusilova et al. (2010) implemented this idea as a coupled system 


consisting of TM3, a global Eulerian atmospheric transport model and the Stochastic Time- 


Inverted Lagrangian Transport (STILT) regional Lagrangian model. Rigby et al. (2011) 


implemented a global inverse model with zoom over several regions resolved with a regional 


Lagrangian transport model NAME. Lagrangian particle dispersion models (LPDMs) are an 


effective tool for simulating observations at high spatial and temporal resolutions (Lin, 2012). 
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Lagrangian models have minimal numerical diffusion, which is inherent in Eulerian models. 


LPDMs have been coupled with numerical weather prediction (NWP) models and used extensively 


in air-pollution dispersion modeling (Uliasz, 1993). Recently, coupled LPDM/NWP models, such 


as the coupled Weather Research and Forecasting-Stochastic Time-Inverted Lagrangian Transport 


(WRF-STILT) model, have been used for a wide range of applications, including surface flux 


estimates by carbon cycle studies (Gerbig et al., 2003; Gourdji et al., 2010; Nehrkorn et al., 2010; 


Pillai et al., 2011). 


Ganshin et al. (2012) developed the Global Eulerian-Lagrangian Coupled Atmospheric model 


(GELCA) based on a framework introduced by Koyama et al. (2011). GELCA combines two 


transport models: The National Institute for Environmental Studies-Transport Model (NIES-TM) 


version 8.1i (Maksyutov et al., 2008; Belikov et al., 2013), a Eulerian global transport model, is 


coupled with FLEXPART version 8.0 (Stohl et al., 2005), a LPDM. The global background mixing 


ratio field generated by NIES-TM is used as time-variant boundary conditions for FLEXPART, 


which performs backward simulations from each receptor point (observation location). GELCA 


has demonstrated better performance in resolving short-timescale variations compared with NIES- 


TM only (Koyama et al., 2011; Ganshin et al., 2012). 


In this paper, we introduce a global CO2 inverse system using GELCA and we evaluate the 


performance of the GELCA inverse modeling system in estimating decadal global monthly CO2 
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flux distributions. As constraining observation data, we used an ObsPack data product, which 
includes actual data (whereas GLOBALVIEW contains only processed data), to take full 
advantage of the coupled modeling approach, which can effectively make use of measurements 
reflecting CO2 exchange along a local path or footprint as well as measurements representing 
hemispheric-scale background air. We examine the sensitivity of the inverse system to the data 


selection by comparing inversion results among five different subsets of the ObsPack data product. 


2 GELCA inverse modeling system 


2.1. GELCA coupled atmospheric model 


A schematic diagram of the GELCA inverse modeling framework is shown in Fig. 1. We 
implemented the coupling at temporal boundaries instead of spatial boundaries. Two-day 
backward-transported particles modeled by FLEXPART were combined at the end points with the 
background CO2 levels 2 days prior to the observations simulated by NIES-TM. The mixing ratio 
C(x,,t;,) at the receptor location x, at time t, can be expressed as the sum of near-site 
contributions calculated by FLEXPART and the background contributions calculated by NIES- 


TM. 


COq, ty) = Gieat site Gs O< t—-ts 2days) + Chackgrouna Ses t, — 2days). (1) 
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FLEXPART simulates the backward transport of 10,000 particles released from each 
receptor point (observation location). Chear site(xr,0 <t,—t <2days) is calculated by 
integrating the sensitivity of CO2 mixing ratio to the surface fluxes (footprint) along 2-day 
trajectory paths of all particles. Cpackgrouna(%r,t — 2days) is the average of the CO2 mixing 
ratios at the time of coupling simulated by NIES-TM, weighted by the number of the end points 
of the back-trajectories contained in each model grid cell. Detailed description about the Eulerian- 


Lagrangian coupling is given in (Ganshin et al., 2012). 


The duration of the backward calculations was set to 2 days to be consistent with the 
timescale of particles leaving the mixed layer (Gloor et al., 2001). Note that coupling with a 
Lagrangian model might not result in a significant improvement, compared with use of a pure 
Eulerian model, for remote sites, because numerical diffusion has a significant impact on the 
simulated mixing ratios at the receptor only if there are inhomogeneous sources or sinks near (less 
than about 2 days upwind of) the receptor. Figure S1 shows the examples of footprints in winter 
and summer for one of the observational datasets used in this study, from which we point out the 
following features. Firstly, the distribution of observation sites mostly determines the footprint 
coverage, making North America and Europe fairly well covered compared to other regions. 
Secondly, the footprint coverage varies significantly with the wind as well. In general, the coverage 


widens in winter compared to summer due to the stronger winds during winter in middle and high 
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latitudes. The wind direction is important as well. For example, in East Asia, in winter, the wind 


blows dominantly from the Siberian High towards the Pacific Ocean, whereas it blows dominantly 


from the Pacific High towards the continent in summer. Since most observation sites are located 


around the east side of the continent, more surface flux signal can be captured from the continental 


East Asia in winter than in summer (Figure S1). 


The meteorological fields driving both models were taken from the Japan 


Meteorological Agency Climate Data Assimilation System (Onogi et al., 2007), which has a 


regular horizontal resolution of 1.25° x 1.25°, 40 hybrid sigma-pressure vertical model levels, and 


a temporal resolution of 6 hours. Planetary boundary layer height data were obtained from the 


European Centre for Medium-Range Weather Forecasts Interim Reanalysis dataset (Dee et al., 


2011). 


2.2. Inversion scheme 


For long-lived trace gases such as CO2, the assumption that atmospheric mixing ratios 


respond linearly to changes in emissions holds well. Under the assumption of linearity, the 


relationship between a vector of observed values (z) and that of sources and sinks (s) can be 


expressed in matrix form as 
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z=Hs+vp, (2) 


where H is a matrix of the sensitivities of observations to changes in emissions or initial conditions 
and vy represents the model-data mismatch error, which includes both observational and model 
errors. The sensitivity of the observations to emission fields can be decomposed into two parts for 


the coupled model: 


H= Hear site + Apackgouna- (3) 


The term Aear site represents the sensitivity of the observations at a particular site to emissions 
surrounding the site as calculated by FLEXPART. The term A background represents the sensitivities 
to background emissions (i.e., the impact of emissions beyond the immediate vicinity of the site), 


which are estimated by NIES-TM. 


Using the Bayesian approach, the measure of the fit between modeled source strengths s 
and observed values z is expressed as a cost function J(s), assuming that s, z, and their uncertainties 


can be described as Gaussian probability density functions: 
_ i Tp-1 T A-1 
J(s) = 5 [ — Hs)’ R-*(z— Hs) + (s - 5) Q (s = a) (3) 


where s, is the vector of the prior source strength, R is the observation error covariance matrix 
and Q is the prior source strength error covariance matrix. The prior covariance structure describes 


the uncertainties of each regional flux, and the correlation in space of the regional fluxes. In the 
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current study, we assumed a diagonal prior covariance matrix, which means that estimated fluxes 
were assumed to show no correlation. At the minimum of J(s), the posterior source strength vector 


s and the posterior covariance matrix Q’ are expressed as 


S=S)+K(z—-Hsp), (4) 


Q' = U — KH)Q. (5) 


where the Kalman gain matrix is 


K = QH™(R+HQH’)"1. (6) 


In a batch mode inversion, all non-observed parameters are estimated using all available 
observations simultaneously at each solution step. When the number of observations and source 
regions increases, the matrix of basis functions H becomes very large, and the computational cost 
becomes very large. To avoid this large computational cost, we employed the fixed-lag Kalman 
Smoother optimization technique (Bruhwiler et al., 2005) to minimize J(s) in Eq. 3 rather than a 
full-matrix batch mode inversion. In this technique, only a subset of the transport information is 
kept at each time step, because most of the signal from source regions decays within a few months 
to half a year. The time window of the transport information kept is called the lag length. We used 
a lag length of 3 months based on the results of the numerical experiments performed by 
Maksyutov et al. (2009) on the influence of various time windows. The detailed description about 


the fixed-lag Kalman smoother applied for atmospheric inversion is given in (Bruhwiler et al., 
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2005). 


The inversion process employed in this study is illustrated in Fig. 2. The modelled CO2 


concentrations Zmoad are sum of the background concentrations zp and the presubtracted 


concentrations Zp calculated by GELCA. The calculation of zp is started from the initial CO2 


mixing ratio 3D field based on an ensemble of forward simulation results by six different transport 


models: Gap-filled and Ensemble Climatology Mean (Saito et al., 2011). Details about the prior 


fluxes used to calculate zp are given in the next section. In each inversion cycle, the modelled 


concentrations are compared to observations Zob and the state vector s is optimized with a 3-month 


window. With the response functions prepared by GELCA, posterior fluxes from step t are 


calculated from the optimized state vector, and incorporated into the background concentration for 


step t+1. 


In the inversion process, we applied our own criteria to filter outliers from datasets. We 


deselected data points for which the model-data mismatch exceeded three times o of the annual 


value of the residual standard deviation (RSD) around the smooth-fit curve of the measurements 


at each site. These data-filtering criteria worked much more effectively in keeping as many data 


while filtering obvious outliers than eliminating data points with a larger model-data mismatch 


than a certain fixed value because the filtering condition is nicely adjusted according to the normal 


variability of CO2 records at each site. 
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The calculation period was from January 2001 to December 2011. The first year was 


considered to be a spin-up period. Fluxes were solved monthly for 64 regions: 42 land regions and 


22 ocean regions (Fig. 3). 


2.3. Prior CO2 flux estimates and their uncertainties 


As prior COz fluxes, we used daily terrestrial biosphere fluxes, monthly oceanic fluxes, 


monthly fossil fuel CO2 emissions, and monthly biomass-burning emissions. The spatial 


resolution of all prior fluxes used in this study was 1° latitude x 1° longitude. The fluxes from the 


biosphere, the oceans, and fossil fuel burning were developed for the NIES Level 4 data product 


of the Greenhouse gas Observing SATellite (GOSAT) project; detailed descriptions are available 


in Maksyutov et al. (2013). 


For daily CO2 exchange between the terrestrial biosphere and the atmosphere, Net Ecosystem 


Production (NEP) of the VISIT (Vegetation Integrative SImulator for Trace gases) process-based 


biosphere model was used (Ito, 2010). The physiological parameters of the VISIT model were 


optimized by the method described by Saito et al. (2014). 


The monthly ocean-atmosphere CO2 exchange was calculated by an ocean pCO2 data 


assimilation system (Valsala and Maksyutov, 2010) based on an ocean offline tracer transport 
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model (OTTM) (Valsala et al., 2008). The OTTM was coupled to a simple biogeochemical model 


that synthesizes the surface ocean pCO2 and air-sea CO? flux by a variational assimilation method. 


Fossil fuel emissions, which were imposed in forward and inverse calculations, were obtained 


from the Open-source Data Inventory of Anthropogenic CO2 (ODIAC) emission dataset (Oda and 


Maksyutov, 2011); emission estimates were based on CDIAC the country-level estimates (~2008) 


and to the year 2008 emissions were projected up to 2011 by using data from the British Petroleum 


Statistical Review of World Energy (British Petroleum, 2012). The emissions dataset used in this 


study are available from the NIES web site (http://db.cger.nies.go.jp/dataset/ODIAC/). 


Prior estimates of CO2 emissions from biomass burning were taken from the Global Fire 


Emissions Database version 3.1 (Giglio et al., 2010; van der Werf et al., 2010). 


The prior flux uncertainty for land regions and oceanic regions were prescribed as the mean 


standard deviation of the monthly NEE calculated by VISIT for the past 30 years (1979-2009) and 


the mean standard deviation of the oceanic flux assimilated by OTTM for the period 2001-2009. 


2.4, Atmospheric CO2 observational data 


In this study, the global atmospheric CO2 data are from the package version 


obspack_co2_1 PROTOTYPE v1.0.3 2013-01-29, hereafter called the ObsPack product, which 
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includes actual CO2 measurement data from multiple observation platforms, including towers, 


aircraft, and ships, contributed by 22 laboratories from around the world. Quality control of data 


in the ObsPack products is left to the data providers, which means that the criteria for data 


selection are not uniform across each product. Most of the data are provided by the data 


providers as ‘representative of site,’ indicating that the data have been selected to represent large, 


well-mixed air masses. When there was more than one laboratory conducting the same type of 


measurements during the same time period at a given site, we chose only one (priority was given 


to NOAA). For tower sites, which provide data from multiple sampling altitudes, we used only 


data from the highest level as representative of the boundary layer mixing ratio. The 


programmable flask package, an automated grab sampler (Turnbull et al., 2012), was categorized 


as a flask sampler in this study. The details of each measurement technique are available 


elsewhere (e.g. Gomez-Pelaez and Ramos, 2011; Stephens et al., 2011). All sites used in this 


study are listed in Table 1. 


The mean annual values of RSD were used as elements of the data mismatch error 


covariance matrix. The RSD for corresponding sites are provided in the 


obspack_co2_1 GLOBALVIEW-CO2_2013 v1.0.3 2013-05-24 product (GLOBALVIEW- 


CO2, 2013). For the sites that are not included in the GLOBALVIEW product, we used the 


average RSD values of all other sites over a latitudinal zone of 20° and an altitudinal level of 1 
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km. These RSD values were also used in data filtering described in the section 2.2. The 


minimum uncertainty value was set to 0.25 ppm. 


In this study, we conducted sensitivity tests for different cases, each consisting of 


different site/data selections and observation uncertainties, to determine the impact of the 


observation settings on the inversion results. We prepared five cases, a control case dataset and 


four different subsets of the control case. The control case used all of the sites listed in Table 1, 


whereas the other four cases included only selected sites (indicated by checkmarks in the four 


right-hand columns of the table). The number of sites and types of data used in each case are 


shown in Table 2. A total of 154 sites were used in the control case, including 35 continuous 


measurement sites and 27 aircraft sites. Among 35 continuous sites, data from 29 sites were 


pretreated to give the “afternoon mean” and “nighttime mean” that is the average value of 12- 


16LT and 2-5LT, respectively. We used both afternoon and nighttime means in the control case. 


We used only data collected at 00:00 UTC and 12:00 UTC values when continuous data were 


provided at an hourly time step, which was the case for 3 JMA sites (MNM, RYO, and YON in 


Table 1.) Since these sites are "marine boundary" sites, we considered diurnal cycles were not 


significant. Among 27 aircraft sites, 26 are vertical profiles at certain locations except 


CONTRAIL (CON; Comprehensive Observation Network for Trace gases by Airliner) (Machida 


et al., 2008; Matsueda et al., 2015) of which we used data from a specific sampling mode ASE 
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(Automatic Air Sampling Equipment) that sampled at certain latitudes during the level flight 


along nearly fixed route between Narita and Sydney/Brisbane. For CON, we aggregated the data 


by 5 latitude bin between 30N — 25S, whereas for other aircraft sites, we aggregated the data by 


vertical bins. The interval of the vertical bins varied from 0.5 — 2 km, mostly following the 


interval used for the corresponding site in the GLOBALVIEW product, 2013. 


Case CT used 90 surface sites, including 22 continuous measurement sites and a 


shipboard site but no aircraft sites. This case is named Case CT because the selected sites are 


those used by CarbonTracker North America (CT2011_ 01), a COz2 measurement and modeling 


system developed by NOAA (Peters et al., 2007). For Case CT, a prior observation uncertainty 


was assigned to each observation site according to the categories defined by Peters et al. (2005); 


these uncertainties ranged between 0.75 ppm (marine boundary layer) and 7.5 ppm (difficult 


sites). Case NF used 61 surface flask sites in the NOAA ESRL Cooperative Global Air Sampling 


Network (Dlugokencky et al., 2013); it included no continuous-measurement or aircraft sites. 


The case NF was named meaning “case NOAA Flasks”. In this study, a sensitivity test was first 


conducted using the control case, Case CT, and Case NF. The observation locations of these 


three cases are shown in Fig. 4. Case SEL and Case NA were then defined on the basis of the 


inversion results obtained in the first sensitivity test. For Case SEL, three sites that showed large 


model-data mismatch values were removed from the control case. The name SEL means that the 
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“data selection” is applied. For Case NA, all aircraft data were removed from Case SEL. NA 


stands for “no aircraft data”. Details of these two cases are explained in sections 3.5 and 3.6. 


3. Results and discussion. 


3.1. Global budget/trend 


Decadal time series of the annual CO2 fluxes estimated by inversion using the five different 


observation datasets (five cases) described in section 2.4 are shown in Fig. 5. The global net fluxes 


into the atmosphere are also plotted against the global atmospheric CO2 growth rate derived 


directly from the observed CO2 (Dlugokencky and Tans, 2014) for comparison. The time series of 


the global net fluxes agreed well among the five cases and were generally consistent with the time 


series of the observed growth rate with respect to both year-to-year variations and annual mean 


values (Fig. 5a). The interannual variability of the net fluxes appeared to be strongly correlated 


with the variability in the land COz flux, shown in Fig. 5b. The large interannual fluctuations of 


the land flux correspond to El Nifio-Southern Oscillation (ENSO) phases (Fig. 5d). High growth 


rates of the CO2 mixing ratio in 2003, 2005, 2007, and 2010 were likely due to reduced CO2 


uptake by land during El Nifio phases (Jones et al., 2001; Knorr et al., 2007; Mabuchi, 2013). The 


low land CO2 uptake in 2002 is considered to be due to global dry condition during the period 


333 


334 


335 


336 


337 


338 


339 


340 


341 


342 


343 


344 


345 


346 


347 


348 


349 


30 November 2016, p.20 / 54 


(Knorr et al., 2007). The interannual variations of the estimated land flux is in phase with the 


ensemble results of nine dynamic global vegetation models (Le Quéré et al., 2013) and with the 


atmospheric inversion results of an ensemble of 11 transport models (Peylin et al., 2013) and 7 


transport models in which GELCA is included as well (Thompson et al., 2016). The increasing 


tendency of the land CO: sink in the early 2000s (Fig. 5b) was also reported by Peylin et al. (2013). 


The effect of ENSO events on the ocean CO2 flux (Fig. 5c) is not clear. In an intercomparison 


study of the air-sea CO? flux in the Pacific Ocean (Ishii et al., 2014), an association of interannual 


variation in the tropics with ENSO events was suggested by diagnostic models and ocean general 


circulation models, but it was not clear in the results of atmospheric inversions. Since global 


interannual variability of land fluxes is generally larger than that of oceanic fluxes, it is more 


challenging for atmospheric inversions to resolve the global interannual variations of oceanic 


fluxes without interference from larger atmospheric CO? fluctuations mainly caused by land fluxes. 


3.2. Regional flux distributions 


The spatial distributions of the decadal mean CO2 fluxes during 2002-2011 of the control case, 


Case CT, and Case NF are shown in Fig. 6. Although the global net fluxes agreed well among 


these three cases, at regional scales, we can see differences in the estimated CO2 fluxes among 


them due to the different observational data used in each case. The three inversion results share 
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some features in common, such as increased uptake in temperate South America and boreal Eurasia 


and increased emissions in tropical South America and southwestern Europe, compared to the prior 


fluxes. These regions, except for southwestern Europe, are poorly constrained. The results for 


tropical Asia (Region 33) are interesting. The decadal mean CO2 flux from this region was positive 


(emission) in the control case, but negative (uptake) in both Case CT and Case NF. Considering 


that the prior flux in this region is positive, the observational constraints of Case CT and Case NF 


changed the flux to negative. In the control case, which had more observational constraints than 


the other two cases, the flux was estimated to be positive. The reason for this difference is 


discussed in section 3.6. 


Table 3 shows the number of data used in the inversion in these three cases. The control case 


used twice as many observational data as Case CT and six times as many as Case NF. Just on the 


basis of the number of observations, the control case would be expected to constrain the regional 


flux estimation much better. However, not only does the effectiveness of the inversion depend on 


the amount of observational data, but it also depends strongly on the spatial (and temporal) 


coverage of the observation sites. We evaluate the effectiveness of the inversion using two 


indicators, the uncertainty reduction (UR) and the model-data mismatch, in the following sections. 


3.3. Uncertainty reduction 
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UR is a measure commonly used to evaluate the effectiveness of observational constraints in 
different regions. UR is defined as the relative difference between the prior and posterior flux 


uncertainty: 

UR= 1-2 , 

Opri 

where Opos¢ and Op,; are the quadratic means of the posterior standard deviation and the prior 
standard deviation, respectively. By definition, the more the posterior error is reduced relative to 
the prior error, the closer to 1 UR becomes, which means that more information from observations 
is provided to the inversion. Figure 7 shows the UR calculated for each region in each of the three 
cases. UR is higher in land regions in the northern mid-high latitudes, where observations are the 
most abundant in the framework of the current surface observation network, whereas UR is lower 
in the poorly covered tropical Northern Hemisphere and the whole Southern Hemisphere. The 
global pattern of the UR distribution is consistent with the UR distributions reported by Chevallier 
et al. (2010), who conducted an inversion at both grid scale (3.75° x 2.5° longitude x latitude) and 
regional scale (22 Transcom3 regions distributed worldwide). 

The control case showed higher UR than Case NF and Case CT in all regions, and the 
difference was significant in East Asia and southern Europe, where the control case had better data 


coverage. Case CT had strong constraints in North America, which is the target of the 


CarbonTracker North America project (Fig. 7b). Outside of North America, however, Case CT had 
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slightly lower URs than Case NF. Considering that most of the stations included in Case NF were 


also in Case CT, this UR difference may be due to the relatively larger prior observation uncertainty 


values assigned to sites outside North America, which resulted in the constraints in Case CT being 


weaker than those in Case NF. 


The UR became more sensitive to the exact location of each observational station as the spatial 


scale became finer. For example, the Transcom3 “Europe” region is, as a whole, relatively well 


covered by observations, but in our land mask in which the Transcom3 “Europe” is divided into 


four sub-regions, Western Europe (regions 39 and 41) is well constrained with denser observation 


coverage, whereas Eastern Europe (regions 40 and 42) is barely constrained owing to fewer 


observation sites. Therefore, the high UR in Transcom3 “Europe” is due mainly to the denser 


observation network in Western Europe. 


3.4. Model-data mismatch 


The model-data mismatch is another measure used to evaluate the effectiveness of inversion 


results. We compared the forward simulation results using the optimized fluxes with observed CO2 


mixing ratios at the observation sites and calculated three measures of the model-data mismatch: 


the model-data bias, the RMSE, and the linear correlation. The model-data bias is a systematic 


mismatch between observations and model (model minus observations) throughout the 
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observation period. The RMSE is an aggregated form of the residuals (the difference between 


simulated values and observed values). The correlation indicates the strength and direction of the 


linear relationship between model output and observed values. These three measures of the model- 


data mismatch calculated for each observation site in the control case are shown in Table 1, and 


the averaged values for all sites used in each case are shown in Table 3. 


Table 3a compares these measures among the control case, Case CT, and Case NF. In the 


control case, the global mean bias of 0.21 ppm was the smallest of the three cases. The mean 


RMSE was +1.34 ppm for the control case, +1.66 ppm for Case CT, and +1.07 ppm for Case NF. 


The differences in RMSE may reflect the fraction of continuous data in each dataset, because the 


RMSE is affected by the higher variability of continuous data compared with flask data. The mean 


correlation coefficient R was 0.962, 0.958, and 0.974 for the control case, Case CT, and Case NF, 


respectively. The model-data correlations were high for all cases, indicating overall good 


performance of the GELCA inversion system. 


The bias and RMSE for each site in the three cases are shown in Fig. 8. The observations were 


not well reproduced by the model at sites that showed high values of both bias and RMSE. Nine 


sites in the control case showed a bias larger than +1 ppm: Heidelberg (HED), Toronto (TOT), Bukit 


Kototabang (BKT), Black Sea (BSC), Lutjewad (LUT), Sutro Tower (STR), Hohenpeissenberg 


(HPB), Baltic Sea (BAL), and Point Arena (PTA). We investigated the reasons for the 
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discrepancies between observations and models at these sites. Among these nine sites, three were 


probably strongly influenced by a local CO2 flux such as urban emissions (HEI and TOT) or forest 


uptake (BKT). For sites located in cities or downwind of urban areas, the model often failed to 


reproduce sporadic sharp peaks in the observations. Continuous measurements inside urban areas 


(HEI and TOT) resulted in a significantly negative bias compared to background sites. LUT and 


STR often captured a high CO2 plume transported from urban areas of The Netherlands and San 


Francisco, respectively. In the case of BSC, the observational behavior has apparently been 


changing. The prominent seasonal cycle seen in the early 2000s gradually disappeared, and the 


frequency of significantly high mixing ratios increased in the late 2000s. These changes might 


reflect a change of either the surrounding environment (possibly increasing COz sources) or the 


measurement system. When both the topography near a site and nearby source or sink distributions 


are complicated, the model tends to express a higher mismatch, as in the cases of HPB, BAL, and 


PTA. 


GELCA showed significantly better performance compared to NIES-TM at locations that 


require finer resolution than 2.5° grid of NIES-TM. For example, two European tower sites 


Ochsenkopf (OXK; 50.0°N, 11.8°E) and Pic du Midi (PDM; 42.9°N, 0.1°E) are located close to 


the border of the model grids in which the topography is rather complicated (on the top of 


mountain). Since NIES-TM cannot resolve the topographical change within each grid, the forward 
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simulation doesn’t fit observation well. On the other hand, GELCA handles the simulation in the 


vicinity of the observation sites with FLEXPART, resulting in much better fit at these difficult sites. 


The observed and simulated CO2 time series for OXK and PDM are shown in Figure S2. The 


performance of GELCA depends on site-specific conditions (e.g., source and sink distributions in 


the vicinity and topographic features), and should be further investigated in future studies. 


3.5. Data selection to reduce observational noise 


Based on the results reported in section 3.4, we designed a new subset called Case SEL to 


minimize noise from observations. To avoid strong local influences, data from BSC, HEI, and TOT 


were excluded from Case SEL. We also applied temporal data selection to seven continuous sites 


located near source or sink areas. Only afternoon averages were used from the tower sites Boulder 


Atmospheric Observatory (BAO), Moody (WKT), Beech Island (SCT), Park Falls (LEF), West 


Branch (WBJ), and Walnut Grove (WGC), and the Pallas-Sammaltunturi (PAL) surface site (PAL), 


to exclude local extreme values in the stable boundary layer at night. In contrast, only nighttime 


averages were used from a mountain site, Shenandoah National Park (SNP; 1008 m above sea 


level) to minimize the bias from local sources or sinks. Temporal data selection has been used in 


previous studies carried out since the TransCom Continuous experiment (Peters et al., 2007; Law 


et al., 2008; Patra et al., 2008; Chevallier et al., 2010). 
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Figure 9b shows the inversion results for Case SEL. The decrease in biospheric emissions from 


southwestern Europe (region 39) compared to the control case is the most prominent feature, 


whereas the impact of Case SEL was not significant in southeastern Europe (region 40). The 


decadal mean decrease of biospheric emissions was 0.076 + 0.024 PgC/region/year in 


northwestern Europe, and 0.040 + 0.026 PgC/region/year in southwestern Europe; both values 


correspond to a 41% change from the estimated regional fluxes in the control case. This result 


indicates that HEI, BSC, and PAL significantly affected the inversion results for Western Europe. 


Estimation of finely distributed anthropogenic and natural sources and sinks in Western Europe 


may need higher spatial and temporal resolution of both prior fluxes and transport simulation. In 


contrast, in North America, there was no significant difference between the control case and Case 


SEL. This result shows that the temporal data selection of continuous tower observations and the 


removal of TOT did not significantly affect the flux estimation in North America. 


3.6. Effect of aircraft observations on flux estimates in tropical Asia 


Here we discuss the large difference in terrestrial biosphere fluxes from tropical Asia (region 33) 


among the prior and three posterior fluxes described in section 3.2. The decadal mean flux and UR 


for this region in the control case, Case CT, and Case NF are shown in Fig. 10. In tropical Asia, 


only one observation site, BKT in Indonesia, was used in this study (Fig. 4). We set the observation 
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uncertainty for BKT to 2.8 ppm for the control case and Case NF; this value is derived from the 


RSD values of the data record at site BKT in the ObsPack GLOBALVIEW product. For Case CT, 


the observation uncertainty for BKT was set to 7.5 ppm, which is the maximum uncertainty in the 


CarbonTracker model, because of its relatively large model-data mismatch (Peters et al., 2005). 


The higher UR for Case NF than Case CT (Fig. 10) can be explained by the smaller prior 


uncertainty assigned to BKT as well as by additional constraints from the Western Pacific Cruise 


(WPC; shipboard observations in the western Pacific Ocean; Fig. 4c) during 2004, which may 


have detected flux signals from tropical Asia. 


As shown in Fig. 8, BKT showed the largest positive bias among all sites used in this study. A 


similar large positive bias for BKT has been found by many other atmospheric inversion studies 


as well (e.g. CarbonTracker Team, 2014). The flask sampling at BKT is conducted on a weekly 


basis, usually around 14:00 LT, when the CO2 hourly average mixing ratio reaches its minimum 


value (Nahas, 2012). Because the observation site is surrounded by a tropical rainforest, the 


samples may be more representative of the daily minimum mixing ratio, which reflects uptake by 


local vegetation, than of the daytime large-scale boundary condition. Thus, the net CO2 uptake in 


tropical Asia in Case CT and Case NF may be largely due to the BKT observations. 


In contrast to Case CT and Case NF, the control case yielded net CO2 emissions in tropical 


Asia even though it used BKT data. The UR of the control case was higher not only in tropical 
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Asia but also in the overall tropical southern Pacific Ocean, compared to Case NF and Case CT. 


This spatial distribution difference of UR suggests that net CO2 emissions in tropical Asia in the 


control case might result from the observational constraints in the tropical southern Pacific Ocean, 


which were used only in the control case. These observational constraints are aircraft data such as 


Rarotonga (RTA; 21.25°S, 159.83°W) and CON, which were included only in the control case. 


Therefore, we hypothesize that the aircraft data affected the inverted flux for the tropical Asia 


region in the control case. The measurement periods of RTA and CON are 2001-2011 and 2001- 


2009, respectively. The frequencies of both observations are by-weekly on average. 


To test this hypothesis, we conducted another sensitivity test by removing all aircraft observations 


from the control case. Without aircraft data, the decadal mean regional flux in tropical Asia became 


negative (Fig. 9c). This result supports our hypothesis that the aircraft data strongly constrained 


the CO? flux estimate in this region. However, the differences did not appear to be significant in 


the oceans and other land regions. To check the sensitivity to aircraft data in detail, differences 


between decadal mean regional fluxes estimated with (Fig. 9a) and without (Fig. 9c) aircraft data 


are shown in Fig. 11. The flux difference in tropical Asia (region 33 in Fig. 11a) stands out among 


the regions. Among oceanic regions, the largest flux difference was found in South Pacific north 


(region 50 in Fig. 11b). This sensitivity analysis indicates that tropical Asia and its neighboring 


ocean regions are the areas most sensitive to the aircraft data used in the inversion. 
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To investigate how surface fluxes from tropical Asia are transported, we calculated the distribution 


of atmospheric CO2 at three vertical levels, approximately 990, 500, and 250 hPa, from monthly, 


pulse emission from the region (annual mean is shown in Fig. 12). We kept a constant CO2 source 


(spatially distributed according to the multiple year mean of NEP from VISIT) in tropical Asia, 


and the transport model tracked its mixing ratios over the course of a month. The results shown in 


Fig. 12 indicates that the signal from surface fluxes in tropical Asia could be detected by aircraft 


observations in the mid/upper troposphere through vertical convection and the consequent rapid 


horizontal transport in the free troposphere. This active convection in tropical Asia as part of the 


Walker circulation must be a key process connecting surface fluxes and aircraft observations. 


We next examined the impact of the aircraft data on the seasonality of terrestrial biospheric fluxes 


from tropical Asia. The decadal mean seasonal cycle derived from the inversion using the aircraft 


data (control case) and the inversion without using the aircraft data (Case NA) are plotted with the 


prior flux in Fig. 13. The flux estimates in Case NA became significantly negative (sink) compared 


with the prior fluxes, whereas the seasonal estimates were mostly positive (source) in the control 


case. This might be due to the increased effect of the negative bias from BKT observation when 


we don’t us aircraft measurements. A major difference in the estimated fluxes between the control 


case and Case NA was found during two periods: May—June and November—January. During May— 


June, the estimated flux was almost zero in the control case, whereas Case NA estimated a sink. 


530 


532 


533 


535 


536 


537 


538 


539 


540 


541 


542 


543 


544 


545 


546 


547 


30 November 2016, p.31 / 54 


During November—January, the control case estimated large emissions, but Case NA estimated 


much lower emissions in November—December and even uptake in January. Niwa et al. (2012) 


have also pointed out CO2 emissions in tropical Asia during October—January in their atmospheric 


inversion for the period 2006-2008 using CONTRAIL data that were enhanced compared to an 


estimate made by using only ground-based data (GLOBALVIEW-COz2). Niwa et al. (2012) used 


CONTRAIL CME (Continuous CO2 Measuring Equipment) data, which were binned and monthly 


averaged after smoothing and gap-filling, and the inversion was conducted with the NICAM-TM 


(Nonhydrostatic Icosahedral Atmospheric Model-based Transport Model). We used only 


CONTRAIL ASE data without preprocessing, and we conducted the decadal inversion by GELCA. 


The decadal inversion results in this study confirmed the strong impact of aircraft data on surface 


flux estimates in tropical Asia. 


To further evaluate the seasonality of the estimated fluxes for tropical Asia, we compared our 


results with bottom-up studies. Among the limited number of bottom-up studies in this region, the 


seasonal cycle of NEP was estimated by continuous CO2 flux measurements using the eddy 


covariance technique in tropical peat swamp forests in Central Kalimantan (Hirano et al., 2007; 


Hirano et al., 2012). These estimates suggest that the CO2 flux is positive during the rainy season 


(November—April) and the late dry season (August—October), whereas it is nearly neutral or 


slightly negative during the early dry season (May—July). The neutral flux in the early dry season 


548 


549 


550 


553 


554 


555 


556 


557 


558 


559 


560 


562 


563 


564 


565 


30 November 2016, p.32 / 54 


and higher emissions during the early rainy season were also seen in the seasonal cycle of the 


control case (Fig. 13). The seasonal cycle of the control case agrees better with the results from 


the bottom-up study than Case NA. This result indicates that the inversion with aircraft data 


captures well the seasonal signals of the regional land biosphere. Both our inversion results and 


those of the top-down study by Niwa et al. (2012) agree better with independent bottom-up studies 


when aircraft data are included; thus, aircraft observations play a key role in constraining CO2 flux 


estimates in tropical Asia. 


4. Summary and conclusions 


We presented an assimilation system for atmospheric CO2 using GELCA, and we demonstrated 


its ability to capture observed atmospheric CO2 mixing ratios and to estimate CO? fluxes. In this 


study, to take full advantage of the data handling efficiency of GELCA, we used non-smoothed 


observational data from ObsPack as constraints. ObsPack includes various types of direct 


atmospheric CO2 measurements, continuous tower measurements, and aircraft measurements, 


provided by a large number of laboratories around the world. 


We conducted sensitivity studies to examine the impact of the observation network setting on the 


inversion results and to optimize the site/data selection to minimize noise while optimizing the 


signal from the extensive observation dataset. We selected five different sets of sites/data from 
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ObsPack: 1) comprehensive dataset (control case); 2) data selection conformed to the 


CarbonTracker North America project (Case CT); 3) data selection conformed to the NOAA ESRL 


Cooperative Global Air Sampling Network (Case NF); 4) data selection according to the model— 


data mismatch of the inversion results of the control case (Case SEL); and 5) Case SEL without 


aircraft sites (Case NA). 


For all cases, the time series of the global net flux to the atmosphere were similar to that of the 


fluxes calculated from the growth rate of the observed global mean atmospheric CO2 mixing ratio. 


At regional scales, estimated seasonal CO2 fluxes were altered by the selection of assimilated CO2 


data. UR was derived at regional scale and compared among cases. In all regions, UR was higher 


in the control case than it was in Case CT and Case NF. Case CT showed considerably higher UR 


in North America, whereas outside of North America, Case NF showed slightly higher UR than 


Case CT. We employed three measures of model-data mismatch between the forward simulation 


results using the posterior fluxes and the observed CO2 mixing ratios: the model-data bias, RMSE, 


and the linear correlation. For most observation sites, the model-data mismatch was reasonably 


small (global mean bias, 0.21 + 0.03 ppm; mean RMSE, 1.38 + 0.23 ppm; correlation coefficient 


R>0.9 for 91% of all used sites). There were some sites with a larger model-data mismatch, caused 


mostly by local conditions. 


Surface fluxes in tropical Asia were found to be the most sensitive to the use of aircraft 
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measurements in the inversion. The seasonal cycle agreed better with the results from bottom-up 


studies when aircraft measurements were used. These results confirm the importance of aircraft 


observations, especially in constraining surface fluxes in the tropics. 


Overall, we found GELCA to be capable of handling various types of observations provided in 


ObsPack, and its performance in reproducing observed concentrations was good, with reasonably 


small model-data mismatches. The sensitivity studies indicated that the reduction of uncertainty in 


COz flux estimation could be improved by expanding the observation network. In particular, the 


study results highlighted the impact of aircraft measurements over the Pacific on surface flux 


estimation in tropical Asia. This study evaluated the basic performance of GELCA as an 


assimilation tool for top-down CO2 flux estimation. Studies are now underway, for example, to 


integrate more observations (e.g., satellite data) into GELCA and to analyze certain regional 


carbon flux estimations. Our future plans include optimization of GELCA's settings (e.g., the 


duration of backward simulation by FLEXPART, temporal/spatial resolutions, and preprocessing 


of certain types of data) according to the specific aims of an investigation. 
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Figure Captions 


Figure 1. Schematic diagram of GELCA inverse modeling framework. 


Figure 2. Illustration of the inversion process employed in this study. The t indicates the time 


step on monthly basis. The modelled CO2 concentrations Zmoa are sum of the background 


concentrations zp and the presubtracted concentrations zp calculated by GELCA. In each 


inversion cycle, the modelled concentrations are compared to observations Zob and the state 


vector s is optimized within a 3-month window. Optimized fluxes are incorporated into the 


background concentration (z»’) before calculating for the next time step. The number of asterisks 


in the upper right of s shows how many times a set of monthly fluxes has been optimized 


previously from past cycles. The prime in the upper right of zp» means that the zp has been 
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updated. The dashed arrows mean monthly calculations by GELCA. 


Figure 3. Definitions of the 64 regions used in the inversion. 


Figure 4. Map showing the observation site locations of the different site selection cases: (a) 
control case (all symbols), Case SEL (green symbols removed), and Case NA (red symbols 
removed); (b) Case CT and (c) Case NF. Symbol shapes indicate the type of sampling: O, 


surface discrete; +, surface continuous; W, ship; ©, aircraft. 


Figure 5. Comparison of global annual mean posterior fluxes: (a) net, (b) land biosphere, and (c) 
ocean. (d) Multivariate ENSO Index (MEI) (Wolter and Timlin, 1993) for 2002—2011. Positive 
fluxes indicate emission and negative fluxes indicate uptake. In (a), the global annual mean 
atmospheric CO2 growth rate is shown with net fluxes. The CO2 growth rate in ppm are 
converted to the emission rates in Pg of carbon with a conversion factor of 2.12 PgC ppm’! via 
simple molecular weight considerations. In (b) and (c), the global annual mean prior fluxes for 


land biosphere and ocean are shown respectively. 


Figure 6. Decadal mean (2002-2011) spatial distributions of posterior fluxes for (a—c) land and 
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(e-g) ocean regions: (a, e) control case, (b, f) Case CT, (c, g) Case NF. Prior fluxes from the (d) 


land biosphere and (h) ocean. Positive fluxes indicate emission and negative fluxes indicate 


uptake. 


Figure 7. Uncertainty reductions by region: (a) control case, (b) Case CT, and (c) Case NF. 


Figure 8. Model-data mismatch for observation sites after inversion: (a) control case, (b) Case 


CT, (c) Case NF. The color and size of the colored circles indicate the bias and the RMSE, 


respectively. The size of the open circles indicates the prior uncertainty value. 


Figure 9. Comparison of decadal mean (2002-2011) spatial distributions of posterior fluxes for 


the land biosphere (left panels) and ocean (right panels): (a) control case, (b) Case SEL, (c) Case 


NA. Positive fluxes indicate emission and negative fluxes indicate uptake. 


Figure 10 (a) Prior and posterior land fluxes and (b) uncertainty reduction (UR) in tropical Asia 


(Region 33) in the control case, Case CT, and Case NF. Positive fluxes indicate emission and 


negative fluxes indicate uptake. 
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Figure 11. Differences between estimated annual mean regional CO? fluxes from the (a) land 


biosphere and (b) ocean derived with and without aircraft observations (control case — Case NA) 


during 2002—2011. The numbered regions are shown in Figure 3. Positive fluxes indicate 


emission and negative fluxes indicate uptake. 


Figure 12. Annually averaged atmospheric CO2 distributions at (a) 990 hPa, (b) 500 hPa, (c) 250 


hPa, calculated from monthly pulsed emission from tropical Asia (Region 33) in 2008. 


Figure 13. Monthly mean land biosphere posterior fluxes (control case - red; Case NA - green) 


and prior fluxes (VISIT - gray), averaged over 2002-2011. Positive fluxes indicate emission and 


negative fluxes indicate uptake. 


Supplementary Figures 


Figure S1. The footprint of 2-day backward trajectory simulation by FLEXPART in (a) January 


and (b) July 2009, for the ground observation dataset used in the control case in this study 


(obspack_co2_1 PROTOTYPE _ v1.0.3 2013-01-29) was used. 
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869 Figure S2. COz2 time series at (a) Ochsenkopf (OXK) and (b) Pic du Midi (PDM) simulated by 


870 GELCA (red circle) and NIES-TM (blue circle), along with the observations (gray circle). 


871 
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Tables 


Table 1. Observational sites used in this study, including the model-data mismatch results for the control case inversion. Check marks in the last four columns show whether 


the site is used in the indicated case. 


Platform, Correlation 
Code Site name Lab. Lat.” Long.?_— Altitude (m)?__— Bias (ppm) RMSE (ppm) CT NF SEL? NA} 
Sampling! coefficient R 
ABP Arembepe, Bahia NOAA g f 12.76S  38.16°W 6 0.34 0.41 0.980 Vv Vv Vv 
NOAA g f 82.45N 62.51W 205 Vv Vv Vv 
ALT Alert, Nunavut —0.22 0.69 0.997 
EC g i 82.45N 62.51°W 210 Vv Vv Vv 
NOAA gp 45.03N 68.68 W 157 Vv Vv Vv 
AMT Argyle, Maine 0.37 3.72 0.928 
NOAA t i 45.03N 68.68 W 160 Vv Vv Vv 
ASC Ascension Island NOAA gf 7.978 14.40°W 90 0.29 0.50 0.997 Vv Vv Vv 
ASK Assekrem NOAA g f 5.63E 1847 —0.05 0.52 0.997 Vv Vv Vv 
AZR Terceira Island, Azores NOAA g f 27.38°W 24 0.24 1.18 0.985 Vv Vv Vv 
BAL Baltic Sea NOAA gf 17.22°E 28 —1.86 3.72 0.926 ¥ v Vv Vv 
Boulder Atmospheric NOAA gp 105.00°W 1584 ¥ av Vv Vv 
BAO -1.51 3.28 0.783 
Observatory, Colorado NOAA t i 105.00°W 1884 v an “ae 
BGI Bradgate, Iowa NOAA ap 94.41 W 600-8100 —0.76 — 0.09 0.52 — 3.87 0.879 — 0.968 Vv 
BGU Begur LSCE gf 3.23°E 13 -1.01 3.24 0.926 Vv Vv 
BHD Baring Head Station NOAA gf 174.87E 95 —0.27 0.52 0.995 ¥ v v v 
BKT Bukit Kototabang NOAA g f 100.32E 850 3.54 3.12 0.813 Vv Vv Vv 
St. Davids Head, 
BME NOAA gf 32.37N  64.65°W 17 0.22 1.19 0.982 ¥ v Vv Vv 
Bermuda 
BMW Tudor Hill, Bermuda NOAA gf 32.26N  64.88°W 60 0.15 1.06 0.991 Vv Vv Vv 
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Beaver Crossing, 


Nebraska 


Bratt's Lake 


BRA EC gi 512N 104.7W 630 —0.03 3.15 0.908 Vv Vv 
Saskatchewan 
NOAA g f 71.32N 156.61 W 28 Vv Vv Vv Vv 
BRW Barrow, Alaska —0.28 0.81 0.995 
N g 
BSC Black Sea, Constanta NOAA g 
Briggsdale, Colorado N a 


Candle Lake, 


Saskatchewan 


Cape Ferguson, 
CFA CSIRO gf 19.28S  147.06E 5 0.28 0.72 0.993 Vv v v 
Queensland 


144.69E 164 


CGO Cape Grim, Tasmania NOAA g f 


Chibougamau, Quebec 
CHR Christmas Island NOAA g f 1.70N 157.15°W 5 —0.27 0.30 0.998 Vv Vv Vv 


Centro de Investigacion 


de la Baja Atmosfera 


CMA — Cape May, New Jersey NOAA ap 300 — 8200 0.894 — 0.988 


CON CONTRAIL NIES/MRI a f 3500 — 12200 


CRI Cape Rama CSIRO gf 15.08°N 73.83°E 66 —0.53 1.80 0.981 Vv Vv 
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Crozet Island 


46.43°S 


51.85°E 


S 
S 
S 
S 


Egbert, Ontario 
EIC Easter Island 


Estevan Point, British 


ESP 


Columbia 


109.43°W 


126.54W 


126.54W 


126.54W 


0.88 


1.46 0.962 


— 1.36 


East Trout Lake, 


Saskatchewan 


104.98°W 
104.98°W 


—0.01 0.94 


—1.77 


—0.02 0.45 


— 0.80 


—0.79 0.59 


— 3.05 


Molokai Island, Hawaii 


Halley Station, 


Antarctica 


Hidden Peak (Snowbird), 
Utah 


40.56°N 


111.65°W 


S 
‘o 
oa 
ea 
| 
S 
Ne) 
S 
n 


30 November 2016, p.49 / 54 


HEI Heidelberg UHEI-IUP gi 49.42N 8.68E 146 4.37 8.17 0.855 
Harvard Forest, 
HFM NOAA ap 42.54N 72.17W 600-8100 —0.21—0.22 0.66 — 2.43 0.959 — 0.991 Vv 
Massachusetts 
HIL Homer, Illinois NOAA ap 40.07N 87.91°;W 600-8100 —0.24--0.17  0.72-—2.98 0.926 — 0.992 Vv 
HPB Hohenpeissenberg NOAA gf 47.80°N 11.02E 990 2.96 5.03 0.854 ¥ vo v v 
NOAA g f 46.95°N 16.65E 344 ¥ va Vv Vv 
HUN Hegyhatsal 0.27 5.28 0.910 
HMS t i 46.95°N 16.65E 363 Vv Vv 
Storhofdi, 
ICE NOAA gf 63.40N 20.29W 100 —0.40 0.81 0.995 ¥ wv Vv Vv 
Vestmannaeyjar 
Izana, Tenerife, Canary NOAA gf 28.3 1°N 16.50°W 2378 v v Vv 
IZO —0.16 0.69 0.995 
Islands AEMET gi 28.31N 16.50 W 2381 Vv Vv 
JFJ Jungfraujoch KUP gi 46.55°N T.ISE 3580 —0.05 1.92 0.940 Vv Vv 
KEY Key Biscayne, Florida NOAA gf 25.67N  80.16°W 6 0.07 0.82 0.993 ¥ va V Vv 
KUM — Cape Kumukahi, Hawaii NOAA gf 19.52N 154.82°W 8 —0.34 0.77 0.994 ¥ vo Vv Vv 
KZD Sary Taukum NOAA g f 44.08°N 76.87TE 595 —1.46 2.50 0.946 ¥ vo Vv Vv 
KZM Plateau Assy NOAA gf 43.25°N 77.88E 2524 0.59 1.62 0.97 ¥ av Vv Vv 
NOAA gp 45.95N 90.27W 715 Vv Vv Vv 
—0.05 3.29 0.939 
LEF Park Falls, Wisconsin NOAA t i 45.95N 90.27W 868 Vv v3 v3 
NOAA ap 45.95N 90.27W 600-4000 —0.29 — 0.53 1.11 -—3.63 0.927 — 0.985 Vv 
LJO La Jolla, California SIO g f 32.9N 117.3°W 20 0.70 0.78 0.993 Vv Vv 
NOAA gf 54.95N 112.45 W 546 Vv Vv Vv 
LLB Lac La Biche, Alberta —0.38 4.03 0.912 
EC gi 54.95N 112.45°W 550 Vv Vv Vv 
LMP Lampedusa NOAA gf 35.52°N 12.62°E 50 —0.41 1.52 0.941 ¥ wv Vv Vv 
LPO Tle Grande LSCE gf 48.8°N 3.58°W 20 —0.43 2.35 0.93 V Vv 
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Lutjewad, Netherlands 


Mawson Station, 


Antarctica 


High Altitude Global 
Climate Observation 


Center 


Worcester, 


Massachusetts 


Orleans 


Otway, Victoria 


6.35°E 61 3.11 5.40 0.81 Vv Vv 
62.87E 42 —0.27 0.24 0.999 Vv Vv Vv 
97.31°;W 4469 0.38 1.15 0.937 Vv Vv Vv 
9.90°W 26 —0.17 0.95 0.993 ¥ av Vv Vv 
177.38°W 11 0.18 0.80 0.994 ¥ av Vv Vv 
37.30E 3649 0.61 1.78 0.939 ¥ av Vv Vv 
155.58W 3402 ¥ av Vv Vv 
—0.14 0.49 0.997 
155.58 W 3437 Vv Vv Vv 
153.98°E 28 —0.07 0.66 0.996 Vv Vv 
158.97E 13 —0.18 0.31 0.999 Vv Vv Vv 
35.26°W 20 —0.38 0.80 0.792 Vv Vv Vv 
70.63°W = 200 — 8000 —0.52 — 0.14 0.80 — 2.67 0.947 — 0.990 Vv 
15.03E 461 0.31 0.56 0.991 Vv Vv Vv 
105.59W 3526 ¥ v Vv Vv 
0.10 1.16 0.980 
105.59 W 3528 Vv Vv Vv 
36.60E 484 —0.35 2.54 0.956 Vv Vv Vv 
88.94W 500-8100 —0.11— 0.36 0.65 — 3.17 0.874 — 0.964 Vv 
2.5E 200-6000 —0.26 — 0.82 0.97 — 4.08 0.882 — 0.986 Vv 
142.82°E 50 —0.44 0.32 0.992 Vv Vv 
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OXK Ochsenkopf NOAA gf 50.03°N 11.81E 1185 0.03 3.84 0.889 V V Vv Vv 
Pallas-Sammaltunturi, NOAA g f 67.9TN 24.12E 565 Vv v Vv Vv 
PAL —0.31 2.03 0.973 
GAW Station FMI gi 67.97N 24.12E 565 Vv Vv 
PDM Pic Du Midi LSCE gf 42.94N O.14E 2877 —0.45 1.73 0.971 Vv Vv 
PFA Poker Flat, Alaska NOAA ap 65.07N 147.29W 100-7600 —0.35-—0.01 0.73 — 0.97 0.990 — 0.995 Vv 
POC Pacific Ocean NOAA s f 20 —0.36 — 0.30 0.31 —0.55 0.994 — 0.998 Vv v “4 V 
Palmer Station, 
PSA NOAA gf 64.92°S_  64.00°W 15 —0.16 0.27 0.999 Vv Vv Vv Vv 
Antarctica 
PTA Point Arena, California NOAA gf 38.95N 123.74W 22 2.47 2.93 0.93 Vv Vv Vv V 
RBA Roof Butte, Arizona NCAR gi 36.46N 109.10°W 3004 0.04 1.13 0.928 Vv Vv 
RPB Ragged Point NOAA gf 13.16N 59.43°W 20 —0.08 0.42 0.998 Vv Vv Vv Vv 
RTA Rarotonga NOAA ap 21.25°S  159.83°W 15 — 6500 —0.30—0.12 0.36—-0.51 0.998 — 0.999 Vv 
RYO Ryori JIMA gi 39.03N 141.82E 280 —0.47 1.65 0.977 Vv Vv 
NOAA ap 2.85S 54.95W 100-5200 Vv 
SAN Santarem —0.54 — 0.66 0.47 — 2.16 0.935 — 0.996 
IPEN a f 2.85S 54.95W 100-4400 Vv 
Charleston, South 
SCA NOAA ap 32.77N 79.55 W 200-—13300 -0.39--0.14 0.56-—2.81 0.912 — 0.994 Vv 
Carolina 
Beech Island, South NOAA gp 33.41N 81.83°W 420 Vv Vv 
SCT ; —0.31 3.90 0.850 
Carolina NOAA 1. 41 33.41N  81.83°W 420 Vv Vv v? 
SEY Mahe Island NOAA g f 4.68°S 55.53°E 3 —0.18 0.57 0.996 Vv V Vv V 
NOAA g f 36.61N 97.49W 374 Vv Vv Vv Vv 
Southern Great Plains, 0.24 3.35 0.913 
SGP LBNL gi 36.61N 97.49W 374 Vv Vv Vv 
Oklahoma 
NOAA ap 36.61N 97.49W 200-—13000 -0.34--0.03 0.75-—3.29 0.849 — 0.984 Vv 
SHM Shemya Island, Alaska NOAA gf 52.71N = 174.13°E 28 —0.55 1.05 0.992 V Vv Vv V 
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SIS Shetland Islands CSIRO g f 60.09°'N 1.25°W 30 0.11 0.73 0.986 Vv Vv Vv 
NOAA gf 14.25S  170.56W 47 Vv Vv Vv 
SMO Tutuila 0.16 0.31 0.999 
NOAA gi 14.25S 170.56W 60 Vv Vv Vv 
Shenandoah National 
SNP NOAA t i 38.62N 78.35°W 1025 0.35 2.61 0.926 Vv An 4 
Park 
Storm Peak Laboratory 
SPL (Desert Research NCAR gi 40.45N 106.73°W 3219 —0.52 1.59 0.949 Vv Vv Vv 
Institute) 
NOAA gf 89.98S 24.80 W 2821 Vv Vv Vv 
SPO South Pole, Antarctica —0.15 0.12 1.000 
NOAA gi 89.98S 24.80 W 2821 Vv Vv Vv 
STM Ocean Station M NOAA g f 66.00°N 2.00°E 7 vv Vv Vv 


Sutro Tower, San 
STR NOAA gp 37.76N 122.45°W 486 —1.97 4.86 0.721 Vv v v 
Francisco, California 


SUM Summit 


Syowa Station, 


SYO 
Antarctica 
TAP Tae-ahn Peninsula 
Tierra Del Fuego, 
TDF NOAA g f 68.31°W 32 —0.29 0.46 0.997 V V “4 Vv 
Ushuaia 


Sinton, Texas 
Trinidad Head, NOAA g f 124.15W 112 —1.24 2.4 0.934 ¥ vv Vv Vv 


124.15 W 200-8100 —0.30 — 0.16 0.60 — 1.02 0.987 — 0.992 Vv 


California NOAA ap 


TRN Trainou LSCE 141 
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ULB Ulaanbaatar NOAA ap 4TAN 106.0E 1500-6000 —0.12 — 0.36 0.11 — 1.18 0.975 — 0.980 V 
UTA Wendover, Utah NOAA gf 39.90N 113.72;W 1332 0.08 1.76 0.969 Vv Vv v Vv 
UUM Ulaan Uul NOAA gf 44.45N 111.10 1012 -0.71 2.12 0.967 v Vv Vv Vv 
NOAA g p 41.72°N 91.35 W 621 Vv Vv Vv 
—0.28 4.04 0.887 
WBI West Branch, Iowa NOAA t i 41.72N 91.35W 621 V Y? v3 
NOAA ap 41.72°N 91.35 W 600-8200 —0.13--0.03 0.64-—2.77 0.915 — 0.991 Vv 


Walnut Grove, NOAA gp 38.27N 121.49°W 91 Vv Vv Vv 
WGC —0.99 4.35 0.721 
California NOAA t i 38.27N 121.49W 483 Vv An fae 
WIS Station, Negev 
WIS NOAA gf 30.86°N 34.78E 482 —0.25 1.73 0.973 ¥ vo Vv Vv 
Desert 
NOAA gp 31.31N  97.33°W 708 Vv Vv Vv 
WKT Moody, Texas 0.16 2.88 0.895 
NOAA t 1 31.31N 97.33°;W 708 Vv Ann fae 
WLG Mt. Waliguan NOAA gf 36.29N 100.90°E 3815 —0.33 1.10 0.987 ¥ va Vv Vv 
Sable Island, Nova 
WSA EC gi 43.93N  60.02°W 30 —0.46 2.27 0.955 Vv Vv Vv 
Scotia 


ZEP Ny-Alesund, Svalbard NOAA g 78.91°N 11.89E 479 —0.02 0.86 0.995 ¥ av Vv Vv 


1. Platform and sampling method: g, surface; a, aircraft; t, tower; s, shipboard; f, flask; 1, continuous; p, programmable flask package (Turnbull et al., 2012; considered a 
flask sampling method in this study). 
2. These parameters may change over time; only the most current information is listed in the table. 


3. Temporal data selection applied in Case SEL and Case NA: a, only afternoon mean was used; n, only nighttime mean was used. 


Table 2. Types of observation sites used in each case. 
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Case Total Surface flask Surface in situ Tower Shipboard Aircraft 
Control case 154 82 33 10 2 27 
Case CT 90 67 14 8 1 0 
Case NF 61 60 0 0 1 0 
Case SEL 151 81 31 10 2 27 
Case NA 124 81 31 10 2 0 


Table 3. The number of data used in the inversion, mean bias, root-mean-square error (RMSE), and correlation coefficient (R): a) control case, Case CT, and Case NF; and b) 


control case, Case SEL, and Case NA. 


a) 
Case Number of data Bias (ppm) RMSE (ppm) R 
Control case 171,641 0.21 1.34 0.962 
Case CT 78,821 0.25 1.66 0.958 
Case NF 28,578 0.23 1.07 0.974 
b) 
Case Number of data Bias (ppm) RMSE (ppm) R 
Control case 171,641 0.21 1.34 0.962 
Case SEL 156,549 0.18 1.29 0.963 
Case NA 115,082 0.20 159 0.958 
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Figure 1. Schematic diagram of GELCA inverse modeling framework 
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Figure 2. Illustration of the inversion process employed in this study. The t indicates the 
time step on monthly basis. The modelled CO, concentrations z_ |, are sum of the back- 
ground concentrations z, and the presubtracted concentrations rae calculated by GELCA. 
In each inversion cycle, the modelled concentrations are compared to observations Z,, 
and the state vector s is optimized within a 3-month window. Optimized fluxes are incor- 
porated into the background concentration (z,’) before calculating for the next time step. 
The number of asterisks in the upper right of s shows how many times a set of monthly 
fluxes has been optimized previously from past cycles. The prime in the upper right of z, 
means that the z, has been updated. The dashed arrows mean monthly calculations by 
GELCA. 
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Figure 3. Definitions of the 64 regions used in the inversion 
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Figure 4. Map showing the observation site locations of the different site selec- 
tion cases: (a) control case (all symbols), Case SEL (green symbols removed), 
and Case NA (red symbols removed); (b) Case CT and (c) Case NF. Symbol 
shapes indicate the type of sampling: ©, surface discrete; +, surface continuous; 
Vv, ship; ©, aircraft. 
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Figure 5. Comparison of global annual mean posterior fluxes: (a) net, (b) land biosphere, 
and (c) ocean. (d) Multivariate ENSO Index (MEI) (Wolter and Timlin, 1993) for 2002-2011. 
Positive fluxes indicate emission and negative fluxes indicate uptake. In (a), the global 
annual mean atmospheric CO, growth rate is shown with net fluxes. The CO, growth rate 
in ppm are converted to the emission rates in Pg of carbon with a conversion factor of 2.12 
PgC ppm via simple molecular weight considerations. In (b) and (c), the global annual 
mean prior fluxes for land biosphere and ocean are shown, respectively. 
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Figure 6. Decadal mean (2002-2011) spatial distributions of posterior fluxes for (a—c) land 
and (e—g) ocean regions: (a, e) control case, (b, f) Case CT, (c, g) Case NF. Prior fluxes from 
the (d) land biosphere and (h) ocean. Positive fluxes indicate emission and negative fluxes 
indicate uptake. Note that the scale is different in land and oceanic fluxes. 
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Figure 7. Uncertainty reductions by region: (a) control case, (b) Case CT, 
and (c) Case NF. 
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Figure 8. Model-data mismatch for observation sites after inversion: (a) control 
case, (b) Case CT, (c) Case NF. The color and size of the colored circles indicate 
the bias and the RMSE, respectively. The size of the open circles indicates the prior 
uncertainty value. 
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Figure 9. Comparison of decadal mean (2002-2011) spatial distributions of posterior 
fluxes for the land biosphere (left panels) and ocean (right panels): (a) control case, (b) 
Case SEL, (c) Case NA. Positive fluxes indicate emission and negative fluxes indicate 
uptake. Note that the scale is different in land and oceanic fluxes. 
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Figure 10 (a) Prior and posterior land fluxes and (b) uncertainty reduction (UR) in 
tropical Asia (Region 33) in the control case, Case CT, and Case NF. Positive 
fluxes indicate emission and negative fluxes indicate uptake. 
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Figure 11. Differences between estimated annual mean regional CO, fluxes from the (a) land 
biosphere and (b) ocean derived with and without aircraft observations (control case — Case NA) 
during 2002—2011. The numbered regions are shown in Figure 3. Positive fluxes indicate emis- 
sion and negative fluxes indicate uptake. 
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Figure 12. Annually averaged CO2 distributions (ppm) at (a)990 hPa, (b)500 
hPa, (c)250 hPa calculated from each monthly pulse emission from Tropical 
Asia (Region 33) in 2008. 
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Figure 13. Monthly mean land biosphere posterior fluxes (control case - 
red; Case NA - green) and prior fluxes (VISIT - gray), averaged over 
2002-2011. Positive fluxes indicate emission and negative fluxes indicate 
uptake. 
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Figure $1. The footprint of 2-day backward trajectory simulation by 
FLEXPART in (a) January and (b) July 2009, for the ground obser- 
vation dataset used in the control case in this study (obspack_- 
co2_1_PROTOTYPE_v1.0.3_ 2013-01-29). 
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Figure S2. CO, time series at (a) Ochsenkopf (OXK) and (b) Pic du Midi 
(PDM) simulated by GELCA (red circle) and NIES-TM (blue circle), along 
with the observations (gray circle). 


